Skip to content

Replace __builtin_ffs with std::countr_zero#827

Merged
laurynas-biveinis merged 1 commit into
masterfrom
rm-__builtin_ffs
Dec 7, 2025
Merged

Replace __builtin_ffs with std::countr_zero#827
laurynas-biveinis merged 1 commit into
masterfrom
rm-__builtin_ffs

Conversation

@laurynas-biveinis
Copy link
Copy Markdown
Collaborator

@laurynas-biveinis laurynas-biveinis commented Dec 5, 2025

Replace GCC/Clang intrinsic __builtin_ffs with C++20 std::countr_zero
in the generic fallback path of N4::find_child(). This improves MSVC
portability and consistency with 10+ other std::countr_zero uses in
the codebase.

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

Summary by CodeRabbit

  • Refactor
    • Optimized internal index calculation for improved efficiency across architectures.
    • Simplified and consolidated boundary checks for more robust handling of edge cases.
    • Minor comment updates and reliability improvements with no changes to public interfaces.

✏️ Tip: You can customize this high-level summary in your review settings.

@laurynas-biveinis laurynas-biveinis self-assigned this Dec 5, 2025
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Dec 5, 2025

Walkthrough

Replaces __builtin_ffs()-based index calculation with std::countr_zero()-based logic in art_internal_impl.hpp, adds an early contains_byte() zero-check returning child_not_found, simplifies the boundary check to result >= children_count, and adjusts returned index and comments accordingly.

Changes

Cohort / File(s) Summary
Bit-manipulation refactor
art_internal_impl.hpp
Replaced __builtin_ffs(...) >> 3 path with std::countr_zero(match) >> 3; added early match = contains_byte(...); if (match == 0) return child_not_found;; changed boundary check to if (result >= children_count); adjusted returned pair to use (result, &children[result]); updated comments and UB note.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Verify that countr_zero-based index matches previous ffs behavior across architectures.
  • Check off-by-one semantics around the changed index offset and the new boundary condition.
  • Validate the early contains_byte() zero-return path in all caller contexts.

Possibly related PRs

Poem

Bytes once scanned with FFS's art,
Now trailing zeros play their part.
A match checked early, bounds made neat,
Off-by-one errors meet defeat. 🎲

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely describes the main change: replacing __builtin_ffs with std::countr_zero in the codebase for improved portability.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch rm-__builtin_ffs

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 96ff513 and 35bc1b4.

📒 Files selected for processing (1)
  • art_internal_impl.hpp (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
art_internal_impl.hpp (3)
olc_art.hpp (2)
  • key_byte (1196-1196)
  • result (749-750)
art.hpp (1)
  • result (566-567)
mutex_art.hpp (1)
  • result (248-248)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (1)
art_internal_impl.hpp (1)

1510-1530: Generic find_child fallback: std::countr_zero usage is correct and UB-safe

The contains_bytestd::countr_zero(match) >> 3U mapping is correct for the documented bit patterns, the match == 0 guard avoids UB, and result >= this->children_count.load() correctly filters out unused/sentinel slots while keeping indexing within children capacity. This preserves the intended behavior of the old __builtin_ffs path and is consistent with other std::countr_zero uses in this file.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude
Copy link
Copy Markdown

claude Bot commented Dec 5, 2025

Claude finished @laurynas-biveinis's task —— View job


Code Review: Replace __builtin_ffs with std::countr_zero

Todo List:

  • Read the modified file to understand the specific changes
  • Search for other uses of std::countr_zero in the codebase for consistency
  • Analyze the bit manipulation logic transformation for correctness
  • Check for potential edge cases, bugs, and security issues
  • Evaluate performance implications
  • Assess test coverage for this functionality
  • Provide comprehensive review feedback

✅ Overall Assessment: APPROVED

This is a well-executed refactor that improves portability and consistency. The logic transformation is mathematically correct and the implementation is sound.

📋 Detailed Review

Code Quality & Best Practices

  • Excellent documentation: The updated comment table clearly shows the bit manipulation mapping
  • Consistent pattern: Aligns with 10+ other std::countr_zero uses in the codebase (art_internal_impl.hpp:601, 736, 1483, 1505, 1904, 1926, 2057, 2237, 2270, 2304)
  • Clean refactor: Removes GCC/Clang-specific __builtin_ffs for standard C++20 std::countr_zero
  • Proper casting: Maintains type safety with appropriate static casts

Logic Correctness

The transformation from __builtin_ffs to std::countr_zero is mathematically sound:

Before (art_internal_impl.hpp:1516-1530):

  • __builtin_ffs(0x80)8 (1-indexed position of first set bit)
  • result = 8 >> 31
  • return result - 10 (final index)

After:

  • std::countr_zero(0x80)7 (count of trailing zeros)
  • result = 7 >> 30
  • return result0 (final index)

Both approaches correctly map the bit position to array index.

Critical Bug Prevention

  • UB prevention: Added explicit if (match == 0) check before std::countr_zero() to prevent undefined behavior with zero input
  • Boundary check update: Correctly simplified from (result == 0) || (result > children_count) to result >= children_count since result is now 0-indexed
  • Index calculation: Properly updated return from result - 1 to result to match the 0-indexed nature of std::countr_zero

Performance Considerations

  • No performance regression: std::countr_zero is typically compiled to the same efficient bit manipulation instructions (bsf/tzcnt on x86)
  • Consistent pattern: Uses the same >> 3U optimization pattern found throughout the codebase
  • Maintained optimization: Kept the boundary condition check instead of masking per the existing comment about benchmark regression

Security Concerns

  • No security implications: Pure mathematical transformation without security-sensitive operations
  • Defensive programming: The early zero-check prevents potential undefined behavior

Test Coverage ⚠️

Recommendation: The existing test suite in test/test_art.cpp provides comprehensive coverage for ART operations, which will exercise this find_child path. However, consider adding specific edge case tests for:

  • Empty key byte scenarios (match == 0)
  • Boundary conditions with children_count
  • All 4 key index positions (0-3) for N4 nodes

The change is well-covered by existing integration tests, but targeted unit tests would provide additional confidence.

Consistency Check

This change completes the migration to std::countr_zero - no remaining __builtin_ffs usage found in the codebase.


🎯 Recommendation: MERGE

This PR successfully achieves its goals:

  • ✅ Improves MSVC portability by removing GCC/Clang-specific intrinsics
  • ✅ Maintains consistency with existing std::countr_zero usage pattern
  • ✅ Preserves correctness through careful logic transformation
  • ✅ Prevents undefined behavior with proper zero-checking
  • ✅ No performance or security regressions

The refactor is mathematically sound, well-documented, and follows established patterns in the codebase.


@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.33%. Comparing base (19d2119) to head (35bc1b4).
⚠️ Report is 11 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #827      +/-   ##
==========================================
- Coverage   98.39%   98.33%   -0.07%     
==========================================
  Files          35       35              
  Lines        6480     6483       +3     
==========================================
- Hits         6376     6375       -1     
- Misses        104      108       +4     
Flag Coverage Δ
Debug 98.28% <ø> (-0.10%) ⬇️
Release 97.61% <ø> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Replace GCC/Clang intrinsic __builtin_ffs with C++20 std::countr_zero
in the generic fallback path of N4::find_child(). This improves MSVC
portability and consistency with 10+ other std::countr_zero uses in
the codebase.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented Dec 6, 2025

@claude
Copy link
Copy Markdown

claude Bot commented Dec 6, 2025

Claude finished @laurynas-biveinis's task —— View job



Code Review: Replace __builtin_ffs with std::countr_zero

Todo List:

  • Read the modified file to understand the specific changes
  • Search for other uses of std::countr_zero in the codebase for consistency
  • Analyze the bit manipulation logic transformation for correctness
  • Check for potential edge cases, bugs, and security issues
  • Evaluate performance implications
  • Assess test coverage for this functionality
  • Provide comprehensive review feedback

✅ Overall Assessment: APPROVED

This is a well-executed refactor that improves portability and consistency. The logic transformation is mathematically correct and the implementation is sound.

📋 Detailed Review

Code Quality & Best Practices

  • Excellent documentation: The updated comment table clearly shows the bit manipulation mapping
  • Consistent pattern: Aligns with 10+ other std::countr_zero uses in the codebase (art_internal_impl.hpp:601, 736, 1483, 1505, 1904, 1926, 2057, 2237, 2270, 2304)
  • Clean refactor: Removes GCC/Clang-specific __builtin_ffs for standard C++20 std::countr_zero
  • Proper casting: Maintains type safety with appropriate static casts

Logic Correctness

The transformation from __builtin_ffs to std::countr_zero is mathematically sound:

Before (art_internal_impl.hpp:1516-1530):

  • __builtin_ffs(0x80)8 (1-indexed position of first set bit)
  • result = 8 >> 31
  • return result - 10 (final index)

After:

  • std::countr_zero(0x80)7 (count of trailing zeros)
  • result = 7 >> 30
  • return result0 (final index)

Both approaches correctly map the bit position to array index.

Critical Bug Prevention

  • UB prevention: Added explicit if (match == 0) check before std::countr_zero() to prevent undefined behavior with zero input
  • Boundary check update: Correctly simplified from (result == 0) || (result > children_count) to result >= children_count since result is now 0-indexed
  • Index calculation: Properly updated return from result - 1 to result to match the 0-indexed nature of std::countr_zero

Performance Considerations

  • No performance regression: std::countr_zero is typically compiled to the same efficient bit manipulation instructions (bsf/tzcnt on x86)
  • Consistent pattern: Uses the same >> 3U optimization pattern found throughout the codebase
  • Maintained optimization: Kept the boundary condition check instead of masking per the existing comment about benchmark regression

Security Concerns

  • No security implications: Pure mathematical transformation without security-sensitive operations
  • Defensive programming: The early zero-check prevents potential undefined behavior

Test Coverage ⚠️

Recommendation: The existing test suite in test/test_art.cpp provides comprehensive coverage for ART operations, which will exercise this find_child path. However, consider adding specific edge case tests for:

  • Empty key byte scenarios (match == 0)
  • Boundary conditions with children_count
  • All 4 key index positions (0-3) for N4 nodes

The change is well-covered by existing integration tests, but targeted unit tests would provide additional confidence.

Consistency Check

This change completes the migration to std::countr_zero - no remaining __builtin_ffs usage found in the codebase.


🎯 Recommendation: MERGE

This PR successfully achieves its goals:

  • ✅ Improves MSVC portability by removing GCC/Clang-specific intrinsics
  • ✅ Maintains consistency with existing std::countr_zero usage pattern
  • ✅ Preserves correctness through careful logic transformation
  • ✅ Prevents undefined behavior with proper zero-checking
  • ✅ No performance or security regressions

The refactor is mathematically sound, well-documented, and follows established patterns in the codebase.


@laurynas-biveinis laurynas-biveinis merged commit 4403320 into master Dec 7, 2025
238 of 239 checks passed
@laurynas-biveinis laurynas-biveinis deleted the rm-__builtin_ffs branch December 7, 2025 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant